Abstract Most research concerning the influence of network structure on phenomena taking place on the network focuses on relationships between global statistics of the network structure and characteristic properties of those phenomena, even though local structure has a significant effect on the dynamics of some phenomena. In the present paper, we propose a new analysis method for phenomena on networks based on a categorization of nodes. First, local statistics such as the average path length and the clustering coefficient for a node are calculated and assigned to the respective node. Then, the nodes are categorized using the self-organizing map (SOM) algorithm. Characteristic properties of the phenomena of interest are visualized for each category of nodes. The validity of our method is demonstrated using the results of two simulation models. The proposed method is useful as a research tool for understanding the behavior of networks, in particular large-scale networks for which existing visualization techniques do not work well.

Keywords Complex Network • Multi-Agent Simulation • Data Mining • Visualization



1 Introduction 

Many real-world phenomena have been studied with respect to the network structure behind them (Boccaletti et al. 2006). Most such studies perform numerical experiments with mathematical models that simulate the phenomena on networks. These networks are generated using some network model, in which each node represents an agent and each edge represents a relationship between agents. The simulation proceeds by allowing the states of agents to evolve according to transition rules.

The analyses of such simulations using complex networks have revealed fundamental mechanisms of such phenomena as epidemic outbreaks (Pastor-Satorras and Vespignani 2001; Moreno et al. 2002; Parshani et al. 2010), decision making with respect to a social dilemma (Nowak et al. 2004; Tomochi 2004; Tsukamoto and Shirayama 2010), and synchronization of interactive units (Gomez-Gardenes et al. 2007). This type of analysis is also performed for some real networks such as the blogosphere (Cha et al.).



The analysis methods used in previous studies can mainly be classified into two types. One investigates the relationships between phenomena and the statistical properties of the network structure, such as the average path length and the clustering coefficient. The mechanisms of the phenomena are analyzed on the basis of the global structure of the network, that is, from a macroscopic standpoint. The other type is based on the relationships between phenomena and the local characteristics of the nodes or edges, such as degree, node or edge betweenness, and the local clustering coefficient. This type of analysis explains the mechanisms of phenomena from a microscopic perspective based on the important nodes or edges in the network. These two methods are often used simultaneously to examine the relationships between phenomena and the network structure. In most cases, however, they are insufficient to reveal the details of network phenomena, since the former method lacks any local perspective and, with the latter method, it is difficult to associate the role of important nodes with the global dynamics of phenomena. One of the most important issues for both types of method is how to connect the influence of local structure on phenomena with the global dynamics.

To address this issue, a visualization method in which the states of agents are visualized at the positions of nodes determined by a graph layout technique is sometimes used to analyze phenomena. Such visualization enables intuitive analysis using local and global structures of networks (Rosen et al. 2011; Adnan et al. 2011; Pham et al. 2011). However, the many possible layouts for the same network make interpreting the results difficult. In addition, as the number of nodes increases, the graph layout itself becomes more complicated (Van Ham and Wattenberg 2008; Uchida and Shirayama 2007), making the extraction of useful information from a large-scale network visualization quite difficult. Another method to address the abovementioned issue is to use the community structure of networks. The community structure connects the local structure with the global one, and it can provide a mesoscopic perspective for the analysis (Newman 2006a,b; Saravanan et al. 2011). However, since the community structure depends on the definition of the community and the extraction method, it is possible to extract different community structures from the same network (Fortunato 2010). Thus, an analysis method based on community structure has a difficulty similar to that of visualization methods with respect to the interpretation of the results.

In the present paper, we propose a new analysis method for simulations using networks. Our method is based on the categorization of nodes. The nodes of the network being used in a simulation are categorized according to their local characteristics, and the simulation results for the network are visualized for each category of nodes. We apply our method to two simulations and discuss its validity.



2 Proposed Method

2.1 Node Categorization 

2.1.1 Characteristic properties of nodes 

Several of the statistical properties of network structures, such as the average path length and the clustering coefficient, are obtained by averaging over the local values at each node. Therefore, such properties, which can be calculated for individual nodes, are used for our node categorization.



First, we define the property of a node as a multivariate variable $n$. The variable $n$ is composed of the degree, the average degree of neighboring nodes, the node betweenness, the average path length, and the clustering coefficient.

Let $N$ be the number of nodes in the network. The $i$-th node of the network is denoted by $v_i$ $(i = 1, \ldots, N)$. The degree of node $v_i$ is denoted by $k_i$. The average degree of neighboring nodes of $v_i$ is denoted by $k_i^{nn}$ and defined as the average degree of the nodes linked to $v_i$.

The node betweenness of $v_i$ is denoted by $b_i$ and defined as the proportion of shortest paths between other pairs of nodes which include $v_i$. $b_i$ is calculated as follows. Let $v_{i_s}$ and $v_{i_t}$ be the start and terminal nodes, respectively.

$$ b_i = \frac{\sum_{\substack{i_s < i_t \\ i_s, i_t \neq i}}^{N} g_i^{(i_s i_t)} / N_{i_s i_t}}{(N-1)(N-2)/2} \tag{1} $$

where $g_i^{(i_s i_t)}$ is the number of shortest paths between $v_{i_s}$ and $v_{i_t}$ via $v_i$, $N_{i_s i_t}$ is the total number of shortest paths between $v_{i_s}$ and $v_{i_t}$, and the denominator is a normalization factor.
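As an illustration of the node betweenness defined above, the following is a minimal sketch in Python that uses Brandes' accumulation scheme, which is our choice for the example and not necessarily the procedure used in this study. The function name `node_betweenness` and the adjacency-dictionary input format are assumptions made for the example; the graph is assumed to be undirected and unweighted.

```python
from collections import deque


def node_betweenness(adj):
    """Normalized node betweenness for an undirected, unweighted graph.

    adj maps each node to a list of its neighbors (hypothetical input format).
    """
    nodes = list(adj)
    n = len(nodes)
    b = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate the pair dependencies from the farthest nodes.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                b[w] += delta[w]
    # Each unordered pair is visited twice (once from each endpoint),
    # so halve the sum before applying the factor (N-1)(N-2)/2.
    norm = (n - 1) * (n - 2) / 2.0
    return {v: (b[v] / 2.0) / norm if norm > 0 else 0.0 for v in nodes}
```

For a three-node path, the middle node lies on the only shortest path between the two end nodes, so its normalized betweenness is 1.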

Let $L_i$ and $C_i$ be the average path length and clustering coefficient of $v_i$, respectively. $L_i$ is calculated by

$$ L_i = \sum_{j \neq i} \frac{d(v_i, v_j)}{N - 1} \tag{2} $$

where $d(v_i, v_j)$ is the length of the shortest path between $v_i$ and $v_j$. $C_i$ is calculated by

$$ C_i = \frac{E_i}{k_i (k_i - 1)/2} \tag{3} $$

where $E_i$ is the total number of links existing between pairs of nodes adjacent to $v_i$.

The property of node $v_i$, the multivariate variable $n_i$, is $n_i = (k_i, k_i^{nn}, b_i, L_i, C_i)$. $n_i$ is calculated for each node and stored with the identifier of the network.
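The local quantities above can be sketched as follows. This is a minimal illustration, assuming a connected, undirected graph given as an adjacency dictionary with integer node labels; the helper names `bfs_distances` and `node_properties` are hypothetical, and the node betweenness component is omitted for brevity.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def node_properties(adj):
    """Return {node: (k, knn, L, C)} for a connected undirected graph."""
    n = len(adj)
    neighbors = {v: set(adj[v]) for v in adj}
    props = {}
    for v, nbrs in neighbors.items():
        k = len(nbrs)
        # Average degree of the nodes linked to v.
        knn = sum(len(neighbors[w]) for w in nbrs) / k if k else 0.0
        # Average shortest-path length from v to the other N - 1 nodes.
        dist = bfs_distances(adj, v)
        avg_len = sum(dist.values()) / (n - 1) if n > 1 else 0.0
        # E: number of links among the neighbors of v, each counted once.
        links = sum(1 for u in nbrs for w in nbrs
                    if u < w and w in neighbors[u])
        clustering = links / (k * (k - 1) / 2) if k > 1 else 0.0
        props[v] = (k, knn, avg_len, clustering)
    return props
```

For a triangle with one pendant node attached, the node of degree 3 has a clustering coefficient of 1/3, since only one of the three possible links among its neighbors exists.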

2.1.2 Categorization method 

Systematic data mining operations are applied to the dataset consisting of $n_i$ $(i = 1, \ldots, N)$ in order to categorize the nodes.

In this paper, the nodes are categorized using the self-organizing map (SOM) algorithm. The results of applying the SOM algorithm are displayed on a two-dimensional lattice. Figure 1 is an example of a 5 by 5 lattice. We consider each region to be a cell, which is identified as (X, Y) according to the axes shown in Figure 1. Let M denote the cell. Using the SOM algorithm, each of the N nodes is assigned to one of the cells. Herein, each category corresponds to one of these cells (Figure 1).
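The categorization step can be sketched with a small online SOM. The text does not specify the SOM variant, the learning schedule, or the feature scaling, so everything below is an assumption chosen for illustration: min-max scaling, a Gaussian neighborhood, linearly decaying learning rate and radius, and the hypothetical function names `normalize`, `train_som`, `best_cell`, and `categorize`.

```python
import math
import random


def normalize(vectors):
    """Min-max scale each component to [0, 1] (assumed preprocessing)."""
    lo = [min(col) for col in zip(*vectors)]
    hi = [max(col) for col in zip(*vectors)]
    return [[(x - a) / (b - a) if b > a else 0.0
             for x, a, b in zip(vec, lo, hi)] for vec in vectors]


def best_cell(weights, vec):
    """Cell whose weight vector is closest to vec (squared Euclidean)."""
    return min(weights, key=lambda c: sum((a - b) ** 2
                                          for a, b in zip(weights[c], vec)))


def train_som(data, width=5, height=5, epochs=20, seed=0):
    """Train a width x height SOM; returns {(X, Y): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize every cell with a copy of a randomly chosen sample.
    weights = {(x, y): list(rng.choice(data))
               for x in range(width) for y in range(height)}
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for vec in rng.sample(data, len(data)):
            frac = step / total
            alpha = 0.5 * (1.0 - frac)                  # learning rate
            sigma = max(width, height) / 2.0 * (1.0 - frac) + 0.5  # radius
            bx, by = best_cell(weights, vec)
            for (x, y), w in weights.items():
                d2 = (x - bx) ** 2 + (y - by) ** 2
                h = alpha * math.exp(-d2 / (2.0 * sigma ** 2))
                for j in range(dim):
                    w[j] += h * (vec[j] - w[j])
            step += 1
    return weights


def categorize(data, weights):
    """Assign each vector to the cell (X, Y) of its best-matching unit."""
    return [best_cell(weights, vec) for vec in data]
```

With the node properties scaled by `normalize`, `categorize(data, train_som(data))` returns one cell (X, Y) per node, which plays the role of the node's category.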
Figure 2 shows an example of node categorization in a network. The network is visualized by Pajek (Batagelj and Mrvar 2003) using the Kamada-Kawai graph layout algorithm (Kamada and Kawai 1989). The nodes are colored according to their category.

The results of the categorization are checked using heat maps for each component of n. The heat maps are generated from the cell averages for each component, which are obtained for each cell by averaging over the nodes belonging to that cell.

Fig. 1 Location of categories in the SOM

Fig. 2 Example of node categorization in a network

2.2 Visualization of Simulation Results

In the present study, we are interested in simulations using networks. Each node represents an agent, and each edge represents a relationship between agents. The simulations proceed so that the state of an agent evolves by applying transition rules.

The nodes in the network used for the simulation are categorized according to their local characteristics. We visualize the simulation results on the network for each category, which contains agents with similar characteristics of the local network structure.

Herein, pie charts are used to visualize the results. The area of each sector of a pie chart is proportional to the proportion of agents in the corresponding state within the category. The pie charts are displayed at the spatial location of each cell in the SOM. Time variation of the proportions in each category is also visualized using pie charts. Figure 3 is an example of this visualization.

Combining the heat maps, which show the features of the categorized nodes, with the pie charts, which show the results of the simulation, we analyze phenomena on the networks.

Fig. 3 Example of a visualization

3 Experiments and Results

3.1 Simulation Models

3.1.1 Generation of networks



In this paper, the networks used in the two simulations are generated using the HK model (Holme and Kim 2002) and the CNN model (Vazquez 2003). Each network is composed of 10000 nodes ($N = 10000$). The average degree $\langle k \rangle$ is set to be about 8.

The HK model is that proposed by Holme and Kim (2002). The network is generated by processes based on "growth of the network", "preferential attachment", and "triad formation". The generated network has a scale-free property in which the degree distribution follows a power law. It has been shown that the clustering coefficient tends to be high in such networks.



The CNN model is that proposed by Vazquez (2003). The network is generated by processes based on "growth of the network" and "change of potential links to real links". The generated network has a scale-free property. It has been shown that the clustering coefficient is high and the network becomes "assortative", meaning that nodes with similar degrees tend to be linked to each other.

On these networks, both the epidemic propagation and the spatial prisoner's dilemma described below are examined.

3.1.2 Epidemic propagation on networks 

We employ the SIR model for the epidemic propagation simulation. Each agent (node) takes one of three different states: S (susceptible or healthy), I (infectious), or R (removed, immunized, or dead). Each agent changes its state according to the states of neighboring agents. Let $\lambda$ and $\mu$ be the infection and recovery rates, respectively. The number of neighboring infectious agents is denoted by $n(I)$. At the beginning, all the agents (nodes) are in the state S. Then, several agents are chosen randomly, and their states are changed to I. The simulation proceeds by the following process after each time increment $dt$, where $dt$ is small:

(a) One agent is chosen randomly.
(b1) If the chosen agent is in state S, its state changes to I with probability $\lambda n(I) dt$.
(b2) If the chosen agent is in state I, its state changes to R with probability $\mu dt$.
(b3) If the chosen agent is in state R, its state does not change.
(c) Processes (a) and (b) are repeated $N$ times.

The processes (a), (b), and (c) are repeated until there are no agents in state I in the network.

In this simulation, the number of infectious agents at the initial stage is set to be 10 and the following parameter values are used: $\lambda = 0.2$, $\mu = 1$, and $dt = 0.01$.
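Steps (a) to (c) above can be sketched as follows. This is a minimal illustration, not the simulation code used in this study; the function name `simulate_sir`, the adjacency-dictionary input, and the interpretation that the time advances by dt after each sweep of N random selections are assumptions.

```python
import random


def simulate_sir(adj, lam=0.2, mu=1.0, dt=0.01, n_seeds=10, seed=0):
    """Run the SIR process until no agent is in state I.

    Returns the final state of every node and the elapsed time.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    state = {v: "S" for v in nodes}
    for v in rng.sample(nodes, n_seeds):     # initial infectious agents
        state[v] = "I"
    n_infectious = n_seeds
    t = 0.0
    while n_infectious > 0:
        for _ in range(len(nodes)):          # step (c): N repetitions
            v = rng.choice(nodes)            # step (a)
            if state[v] == "S":              # step (b1)
                n_i = sum(1 for w in adj[v] if state[w] == "I")
                if rng.random() < lam * n_i * dt:
                    state[v] = "I"
                    n_infectious += 1
            elif state[v] == "I":            # step (b2)
                if rng.random() < mu * dt:
                    state[v] = "R"
                    n_infectious -= 1
            # step (b3): an agent in state R does not change
        t += dt                              # assumed: one sweep per dt
    return state, t
```

At termination every agent is in state S or R, so the set of agents in state R records all agents that were infected at some point.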

3.1.3 Spatial prisoner's dilemma 

The spatial prisoner's dilemma is a model for decision 
making with respect to a social dilemma. Each agent 
takes one of two strategies: C (cooperation) or D (de- 
fection). These two strategies are regarded as the states 
of the agents. 

First, each agent occupies a node of the generated 
network and has an equal probability of choosing co- 
operation or defection as an initial strategy. All agents 
simultaneously update their strategy as follows: 

(a) Each agent plays the prisoner's dilemma game with all neighboring agents and receives the resulting payoff shown in Table 1 (T stands for the temptation to defect).

(b) Each agent imitates the strategy of the wealthiest among its neighbors. If an agent itself has the highest payoff among its neighbors, it retains its own strategy for the next iteration.



Table 1 Payoff matrix of the spatial prisoner's dilemma

              Cooperator               Defector
Cooperator    1, 1                     0, T
Defector      T, 0                     $\epsilon$, $\epsilon$

In this simulation, $T = 1.5$ and $\epsilon = 0$, as in the work of Nowak and May (1992).
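One synchronous update of the spatial prisoner's dilemma can be sketched as follows. The function names (`random_strategies`, `payoff`, `pd_step`) are hypothetical, and two details are assumptions not stated in the text: an agent's payoff is taken as the sum over the games with all of its neighbors, and ties are broken in favor of the agent's own strategy, then by neighbor order.

```python
import random


def random_strategies(adj, seed=0):
    """Each agent starts as C or D with equal probability."""
    rng = random.Random(seed)
    return {v: rng.choice("CD") for v in adj}


def payoff(own, other, T=1.5, eps=0.0):
    """Payoff of a player using `own` against a player using `other`."""
    if own == "C":
        return 1.0 if other == "C" else 0.0
    return T if other == "C" else eps


def pd_step(adj, strategy, T=1.5, eps=0.0):
    """One synchronous update; returns the new strategy of every agent."""
    # Step (a): total payoff from the games with all neighbors (assumed sum).
    score = {v: sum(payoff(strategy[v], strategy[w], T, eps) for w in adj[v])
             for v in adj}
    new_strategy = {}
    for v in adj:
        # Step (b): imitate the wealthiest neighbor, unless v itself
        # has a payoff at least as high (ties keep the own strategy).
        best = v
        for w in adj[v]:
            if score[w] > score[best]:
                best = w
        new_strategy[v] = strategy[best]
    return new_strategy
```

For example, on a star whose center defects and whose four leaves cooperate, the center earns 4T = 6 while every leaf earns 0, so all leaves switch to defection in the next iteration.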



3.2 Node Categorizations 

First, the property $n_i = (k_i, k_i^{nn}, b_i, L_i, C_i)$ of the node $v_i$ is calculated for each node of the networks generated by the HK model and the CNN model. Then, the nodes are categorized into the 5 by 5 lattice shown in Figure 1 using the SOM algorithm.

Next, the heat maps for each component of the property n are created. The heat maps for the networks generated by the HK and CNN models are shown in Figures 4 and 5, respectively. In these figures, the number of nodes in each category is shown in the upper left panel of each figure.
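The quantities plotted in such heat maps, namely the per-cell average of each component and the number of nodes per cell, can be sketched as follows. The function name `cell_averages` and the dictionary inputs are assumptions made for the example.

```python
from collections import defaultdict


def cell_averages(features, cells):
    """Average each property component over the nodes of each cell.

    features: {node: tuple of components}; cells: {node: (X, Y)}.
    Returns ({cell: tuple of means}, {cell: number of nodes}).
    """
    sums = {}
    counts = defaultdict(int)
    for node, vec in features.items():
        cell = cells[node]
        if cell not in sums:
            sums[cell] = [0.0] * len(vec)
        for j, value in enumerate(vec):
            sums[cell][j] += value
        counts[cell] += 1
    means = {cell: tuple(s / counts[cell] for s in total)
             for cell, total in sums.items()}
    return means, dict(counts)
```

Cells to which no node is assigned do not appear in the returned dictionaries and would be drawn as empty cells.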

For the network generated by the HK model, each category is characterized as follows:

- the degree $k$ increases with increasing X. The maximum value appears at (4, 4).
- the average degree of neighboring nodes $k^{nn}$ increases with increasing Y, and, for fixed Y, the maximum value is in the central X region.
- the node betweenness $b$ increases with both increasing X and increasing Y. The maximum value is at (4, 4) and is significantly larger than the values for other categories.
- the average path length $L$ decreases with both increasing X and increasing Y. Compared to $b$, $L$ varies little from category to category.
- the clustering coefficient $C$ decreases with increasing X.

For the CNN model, each category is characterized as follows:

- $k$ decreases with increasing X.
- $k^{nn}$ decreases (increases) with increasing X (Y).
- $b$ decreases with increasing X.
- $L$ increases (decreases) with increasing X (Y).
- $C$ increases with both increasing X and increasing Y.



3.3 Simulation of Epidemic Propagation 

SIR states over time for the networks generated using the HK and CNN models are shown in Figures 6 and 7, respectively.
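The proportions drawn in each pie chart can be obtained by counting agent states per category at a given time. The following is a minimal sketch; the function name `state_proportions` and the dictionary inputs are assumptions made for the example.

```python
from collections import Counter, defaultdict


def state_proportions(state, cells, states=("S", "I", "R")):
    """Proportion of agents in each state within each SOM cell.

    state: {node: state label}; cells: {node: (X, Y)}.
    """
    counts = defaultdict(Counter)
    for node, s in state.items():
        counts[cells[node]][s] += 1
    result = {}
    for cell, c in counts.items():
        total = sum(c.values())
        result[cell] = {s: c[s] / total for s in states}
    return result
```

Calling this function on snapshots of the simulation at successive times gives the time variation of the proportions in each category; passing `states=("C", "D")` would give the corresponding proportions for the spatial prisoner's dilemma.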
Fig. 4 Heat maps for the HK model



For the HK model, Figure 6 shows that the initial outbreaks start around (4, 4) at t = 3.0. Since $k$ is also largest in this category (Figure 4), the model indicates that epidemic outbreaks start near hubs, after which the epidemic expands. From t = 3.0 to 4.0, the epidemic spreads from category (3, 4) to (0, 4) in the negative X direction and from category (4, 3) to (4, 1) in the negative Y direction. Referring to Figure 4, these categories coincide with those which have large $k$ or $k^{nn}$ and short $L$. After t = 4, the epidemic seems to spread toward category (0, 0). This means that the speed of propagation of the epidemic within categories with smaller $k$ and $k^{nn}$ and longer $L$ is relatively slow. At the terminal state (t = 16.78), the pattern of the distribution of the proportions of infected agents, which is shown by the distribution of state R, is almost the same as that of $b$ in Figure 4. This suggests that agents on the shortest paths tend to transmit the epidemic.

For the CNN model, Figure 7 shows that a violent outbreak occurs in category (0, 4) at t = 0.5. As shown in Figure 5, this category has the largest $k$ and $k^{nn}$. Therefore, we assume that mutual infection among agents at hubs causes the violent outbreak. After t = 0.5, the epidemic spreads from category (0, 4) toward (4, 0). The proportions of infected agents are quite small in categories near (4, 0). As shown in Figure 5, these categories have small $k$ and $k^{nn}$ and long $L$. This means that agents which have fewer links with, and are distant from, other agents have a lower probability of infection. This trend is more obvious in the CNN model (Figure 7) than in the HK model (Figure 6). We also found that the pattern of the distribution of the proportions of infected agents at the terminal state (t = 12.6) is almost the same as those of $k^{nn}$ and $L$ shown in Figure 5, whereas, for the HK model, the pattern was similar to that of $b$.



3.4 Simulation of Spatial Prisoner's Dilemma 

The variation over time of the distribution of cooperators and defectors within the networks generated by the HK and CNN models was visualized using the proposed method. The visualized results for the HK model are shown in Figure 8. For simplicity, we show only the case in which cooperative agents are dominant.

For the HK model, cooperators have increased in several categories by t = 1, in particular in categories (2, 4), (3, 4), and (4, 4). Referring to Figure 4, the categories (2, 4) and (4, 4) are characterized by having the largest $k^{nn}$ and $k$, respectively. From t = 2 to 6, the defectors come to dominate the network. However, at t = 8 the cooperators increase once more in categories (2, 4), (3, 4), and (4, 4). Later, the defectors again come to dominate, as shown at t = 13, but then the cooperators increase again. At this stage, the increase in cooperators starts first in categories characterized by large $k^{nn}$. The speed of the increase is then faster in categories with relatively large $k$. Finally, the cooperators come to dominate almost the whole network, as shown at t = 21. At this time, defectors survive only in categories with small $k^{nn}$, such as (0, 0) and (0, 1).

Fig. 5 Heat maps for the CNN model

For the CNN model, the variation over time of the distribution of cooperators and defectors as observed using the proposed method is similar to that for the network generated by the HK model. Defectors dominate initially, and cooperators expand outward from categories with large $k^{nn}$, starting from categories with large $k$. However, compared with the results for the HK model, more defectors remain in categories with small $k$ or large $C$.

4 Conclusions 

In this paper, we proposed a new analysis method for phenomena on networks based on a categorization of nodes. First, local statistics such as the average path length and the clustering coefficient for a node are calculated and assigned to the respective node to be used as the property of the node, denoted by the multivariate variable n. Then, the nodes are categorized by applying the SOM algorithm to the set of n. Characteristic properties of the phenomena of interest are visualized for each category. The results are easily displayed on a two-dimensional lattice composed of the categories, even for large-scale networks for which existing visualization techniques do not work well. In our approach, the relationships between the phenomena and the network structure are revealed by the transition of the states of agents among the categories.

An epidemic propagation and a spatial prisoner's dilemma were examined using our method. Our analysis of the two simulations showed that hubs play important roles in transmitting states. In the case of the epidemic propagation, we found that the epidemic outbreak starts near hubs and continues by expanding outward. In the spatial prisoner's dilemma, the visualization showed that cooperators expand outward from the cooperative hubs after the defectors have come to dominate almost all of the rest of the network. Although these results have been reported in previous studies (Moreno et al. 2002; Tsukamoto and Shirayama 2010), two new findings were obtained using the proposed method: agents in categories with large $k^{nn}$ also promote the expansion of the epidemic and of cooperation, independent of the degree $k$, and, for epidemic propagation, the pattern of the distribution of the proportions of infected agents (state R) is almost the same as that of the node betweenness $b$. In future work, we will apply our method to other kinds of simulations using networks and then identify the criteria that must be satisfied before our method can be applied.

Fig. 6 SIR states over time for the network generated using the HK model



Acknowledgements This work was partially supported by Grant-in-Aid for Scientific Research (B) (21300031).



Fig. 7 SIR states over time for the network generated using the CNN model (panels from the initial state to t = 12.6, the terminal state; legend: S, I, R)

References

M. Adnan, M. Nagi, K. Kianmehr, R. Tahboub, M. Ridley, and J. Rokne. Promoting where, when and what? An analysis of web logs by integrating data mining and social network techniques to guide ecommerce business promotions. Social Network Analysis and Mining, 1(3):173-185, 2011.

V. Batagelj and A. Mrvar. Pajek - analysis and visual- 
ization of large networks. Lecture Notes in Computer 
Science, 2265:77-103, 2003. 

S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.U. Hwang. Complex networks: Structure and dynamics. Physics Reports, 424(4-5):175-308, 2006. ISSN 0370-1573.

M. Cha, J.A.N. Perez, and H. Haddadi. The spread of 
media content through blogs. Social Network Analy- 
sis and Mining. 

S. Fortunato. Community detection in graphs. Physics Reports, 486(3-5):75-174, 2010. ISSN 0370-1573.



Tomoyuki Yuasa, Susumu Shirayama 



J. Gomez-Gardenes, Y. Moreno, and A. Arenas. Paths to synchronization on complex networks. Physical Review Letters, 98(3):34101, 2007. ISSN 1079-7114.

P. Holme and B.J. Kim. Growing scale-free networks 
with tunable clustering. Physical Review E, 65(2): 
26107, 2002. 

T. Kamada and S. Kawai. An algorithm for drawing general undirected graphs. Information Processing Letters, 31:7-15, 1989.

Y. Moreno, R. Pastor-Satorras, and A. Vespignani. Epidemic outbreaks in complex heterogeneous networks. The European Physical Journal B - Condensed Matter and Complex Systems, 26(4):521-529, 2002.

M.E.J. Newman. Finding community structure in net- 
works using the eigenvectors of matrices. Physical 
Review E, 74(3):36104, 2006a. ISSN 1550-2376. 

M.E.J. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23):8577, 2006b.

M.A. Nowak and R.M. May. Evolutionary games and 
spatial chaos. Nature, 359(6398):826-829, 1992. ISSN 
0028-0836. 

M.A. Nowak, A. Sasaki, C. Taylor, and D. Fudenberg. Emergence of cooperation and evolutionary stability in finite populations. Nature, 428(6983):646-650, 2004. ISSN 0028-0836.

R. Parshani, S. Carmi, and S. Havlin. Epidemic Thresh- 
old for the Susceptible-Infectious-Susceptible Model 
on Random Networks. Physical Review Letters, 104 
(25):258701, 2010. 

R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 86(14):3200-3203, 2001.

M.C. Pham, R. Klamma, and M. Jarke. Development of computer science disciplines: a social network analysis approach. Social Network Analysis and Mining, 1(4):321-340, 2011.

D. Rosen, G.A. Barnett, and J.H. Kim. Social networks 
and online environments: when science and practice 
co-evolve. Social Network Analysis and Mining, 1(1): 
27-42, 2011. 

M. Saravanan, G. Prasad, S. Karishma, and D. Suganthi. Analyzing and labeling telecom communities using structural properties. Social Network Analysis and Mining, 1(4):271-286, 2011.

M. Tomochi. Defectors' niches: prisoner's dilemma 
game on disordered networks. Social Networks, 26 
(4):309-321, 2004. ISSN 0378-8733. 

Ei Tsukamoto and Susumu Shirayama. Influence of the variance of degree distributions on the evolution of cooperation in complex networks. Physica A: Statistical Mechanics and its Applications, 389(3):577-586, 2010.



M. Uchida and S. Shirayama. Formation of patterns 
from complex networks. Journal of Visualization, 10 
(3):253-255, 2007. ISSN 1343-8875. 

F. Van Ham and M. Wattenberg. Centrality based visu- 
alization of small world graphs. In Computer Graph- 
ics Forum, volume 27, pages 975-982. Wiley Online 
Library, 2008. 

A. Vazquez. Growing network with local rules: Pref- 
erential attachment, clustering hierarchy, and degree 
correlations. Physical Review E, 67(5):56104, 2003. 



A New Analysis Method for Simulations Using Node Categorizations 



Fig. 8 CD states over time for the network generated using the HK model (panels from the initial state to t = 21, the terminal state)



